Course Information
Course Title
巨量資料統計與探勘
Big Data Statistics and Mining 
Semester
102-1 
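Intended For
College of Electrical Engineering and Computer Science, Graduate Institute of Biomedical Electronics and Bioinformatics
Instructor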
歐陽彥正 
Course Number
CSIE7120
Course ID
922 U4130
Section
 
Credits
Full/Half Year
Half year
Required/Elective
Elective
Class Time
Friday, periods 3, 4, @ (10:20~)
Classroom
CSIE Building, Room 101 (資101)
Remarks
Restricted to third-year undergraduates and above
Enrollment limit: 50 students
Ceiba Course Website
http://ceiba.ntu.edu.tw/1021bdsm 
Course Introduction Video
 
Course Syllabus
Course Overview

This course covers the fundamental theory of statistical analysis and mining for handling big data. 

Course Objectives
Students enrolled in this course will learn statistical and data mining
approaches with low time complexity, which offer significant advantages
when applied to big data. Students will also develop valuable insights
for tackling big data.

Statistical analysis of big data (9 weeks)
* Review of probability and statistics
* Convergence rates of statistical estimators
* Challenges introduced by high dimensionality
* Statistical approaches with low time-complexity

Data mining of big data (9 weeks)
* Supervised learning algorithms
* Unsupervised learning algorithms
* Regression and function approximation with regularization networks
* Optimization algorithms 
Course Requirements
 
Expected Weekly Study Hours After Class
 
Office Hours
 
Required Reading
 
References
Hogg, R.V. and E.A. Tanis, Probability and Statistical Inference
Grading
(For reference only)
   
Course Schedule
Week      Date    Topic
Week 1    9/13    Introduction
Week 2    9/20    Mid-Autumn Festival (no class)
Week 3    9/27    Continuous Distribution
Week 4    10/04   Parametric estimation
Week 5    10/11   Kernel density estimation
Week 6    10/18   Kernel density estimation
Week 9    11/08   Midterm exam
Week 11   11/15   Feature Selection
Week 14   12/13   Clustering
Week 15   12/20   Optimization.web & Microarray overfit
Week 17   1/03    Decision Tree